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ABSTRACT 



The image motion analysis algorithms that generate the two dimensional velocity of 
objects in a sequence of images are developed. The algorithms considered consist of: the 
parallel extended Kalman filter method; the spatiotemporal gradient methods; the 
spatiotemporal frequency methods; and the one-dimensional FIT methods. These algorithms 
are designed to perform on low signal to noise ratio images. Each of these algorithms is 
applied to a sequence of computer generated images with varying signal to noise ratios. 
Simulations are used to evaluate the performance of each algorithm. 
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I. INTRODUCTION 



The measurement of displacement between successive frames in an image sequence 
gives information about the motion (or velocity) of the objects in these images. The 
motion information is an important element in the source coding, restoration, analysis 
and interpretation of time-varying imagery, since a time-varying scene can be 
characterized to a large extent by the motion it contains. The characterization may lead 
to detection of objects (targets, their velocities and trajectories), it can be used to 
distinguish between scene and noise components in order to reduce additive noise in 
sequences and furthermore can be used as a cue for image segmentation. Motion 
imformation may also be useful for the compact representation of the image sequences 
for transmission and storage purposes (as in motion-compensated interframe coders). The 
motion patterns in the image sequence may be due to the motion of objects in the scene, 
the motion of the observer or both. Also, the motion may be apparent motion where a 
change in the image intensity between frames gives the illusion of motion. 

There are 4 basic approaches for motion estimation: 

(1) feature correspondence (matching), 

(2) difference picture methods, 

(3) spatio-temporal gradients, 

(4) Fourier phase changes. 

Matching methods use algorithms for matching the components of objects between two 
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successive frames of a scene containing moving objects. This leads to an optimization 
process which can easily be solved. The drawback to this approach is that identifiable 
features must be derived from the image before matching can be tried. Also the spatial 
range over which image flow is derived is short in comparison to the size of the region 
in which identifiable features exist. Difference picture methods start with a pixel-wise 
difference between successive frames in an image sequence. The differences can be 
thresholded to produce a binary motion image. Since difference picture methods do not 
take the direction or speed of the motion into account, the methods are not easily applied 
when the background is moving and can not easily differentiate between different objects 
moving with different velocities. The problem with difference picture methods is that 
they are local and do not combine spatial and temporal changes over neighborhoods to 
improve the information on the velocity field. Spatio-temporal gradient methods relate 
the spatial and temporal gradient at each pixel in the image plane to the instantaneous 
velocity at that pixel point. This forms a two-dimensional vector field called the 
displacement field (velocity field or optical flow). The Fourier-phase methods utilize the 
shift property of Fourier transform and generate a solution based on the spatial-frequency 
domain data. 

In this thesis, some of the current motion estimation algorithms are introduced, 
discussed, and improved. A comparison between algorithms is established. Chapter II 
includes the application of the parallel extended Kalman filter. The Extended Kalman 
Filter is applied to sequential images containing a moving object to estimate object’s two- 
dimensional velocity components and to increase image quality by the reduction of noise 
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for low signal to noise images. The idea of using an extended Kalman filter (EKF) to 
enhance the quality of the image frame and provide an estimate of the object velocity was 
proposed by Burl [Ref. 1]. Chapter III includes the algorithms that utilize the Image 
Flow Constraint equation which relates the spatial changes in an image frame to the 
temporal changes between the frames. Several ways for solving this equation are 
discussed. Chapter IV and V address the Fourier transform methods of motion 
estimation. These relate phase changes in the Fourier domain representation of image 
frames to the velocity of objects contained in the image frames. The algorithms discussed 
in this thesis are simulated under different background and noise conditions to evaluate 
their performance. Chapter VI includes the simulation results and the final comments. 

In this thesis, a moving image model is defined as a series of image frames 
containing a moving object. An image frame consists of a two-dimensional array where 
the individual array elements are defined as pixels. Each pixel is given a numerical value 
based on the intensity of the image at that point. 
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n. THE EXTENDED KALMAN FILTER 



A. INTRODUCTION 

An extended Kalman filter (EKF) may be developed for the estimation of both the 
image and velocity components from a sequence of images in a nearly optimal manner 
[Ref. 1]. First the EKF requires a dynamic model that describes the evolution of the 
sequential image frames. A simple dynamic model consists of a shift operator in the 
spatial domain which describes the motion of the object. Unfortunately the EKF 
developed from this model requires a prohibitive number of computations. This difficulty 
is overcome by transforming both image and EKF into the spatial frequency domain. The 
two-dimensional discrete Fourier transform is used to transform these two-dimensional 
image frames from the spatial domain to the spatial frequency domain. By transforming 
the image frame to the spatial frequency domain, the shift operator in the spatial domain 
becomes a phase shift operator in the spatial frequency domain. The shift of each spatial 
frequency is given by the scalar product of the velocity and the frequency. The EKF cah 
be applied to each of the spatial frequencies separately. The number of calculations can 
then be reduced by limiting the number of spatial frequencies that are analyzed. 
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B. THE MOVING IMAGE MODEL 



1. The Moving Image Model in the Spatial Domain 

The evolution of successive image frames containing a moving object is 
modeled by a nonlinear state equation in the spatial domain. The background of the 
image is assumed to be zero to simplify the initial presentation. A new model for the 
images with background will be presented later. Also the moving object is assumed to 
be completely contained in the image. So the state-space equation is defined as, 
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v(jt+l) 




0 7 
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( 2 . 1 ) 



= n(v) 

where x(k) is the state vector of the system, /fXrj G is the NxN pixel image at time 
k stacked into a vector of the amplitudes of each pixel, v G 7?^ is the velocity vector 
(consists of a two-element column vector describing the horizontal and vertical velocities 
of the moving object in terms of pixels per sampling time) and S(v) is a two-dimensional 
shift operator with magnitude and direction given by v. The vectors /and v combine to 
form the state vector x G Note that object motion is equivalent to a shift operator 

applied to the entire image since the background is zero. Also the shift operator is a 
circular shift operator since the object is assumed to remain within the boundary. 

Equation 2. 1 describes a sequence of identical replications of an object as it 
moves accross an image frame with constant velocity. A more realistic model would be 
to include changes in the object due to rotations, change in the aspect angles etc. 
Additionally, changes in the object’s velocity may occur. A new model which includes 
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these changes may be developed as, 



JC(it+l) 



5(v) 0 
0 I 



x(k) ^ 
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( 2 . 2 ) 



= a(v) x(k) -K C(^), 

where the plant noise is composed of the image plant noise, ^ and the velocity plant 
noise, i). The plant noise is assumed to be a zero-mean, white, Gaussian random noise 
with a covariance matrix defined as. 
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Cv 
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(2.3) 



where E[*] is the expectation operator and I is the identity matrix. The measurement of 
the image, yOc), is defined by the corresponding measurement equation, 

y(k) =fik)+n(k) = [/ 0]x(it)+n(Jt) = C x(k)+n(k), (2-4) 

where y E is the measured image and n(k) E is the measurement noise which 
is assumed to be a zero-mean, white, Gaussian random process with a known covariance 
matrix. 



^nn = ^ ["(^) ^ ^(^)] = 



(2.5) 



2. Derivation of the Shift Operator in the Spatial Frequency Domain 

The two-dimensional image frame defined as f(n,,ri 2 ) where 0 < n, < N-1 
and 0 < TI 2 < N-1 is transformed from the spatial domain to the spatial frequency 



6 



domain by using the two-dimensional discrete Fourier transform (DFT), 






N-\ N-\ 



nj=0 






( 2 . 6 ) 



where 0 < k, < N-1 and 0 < ^2 < N-1. 

The shift operator in the spatial frequency domain is derived from the circular 
shift property of the discrete Fourier transform [Ref. 2:p. 67]. The circular shift 
property is defined as, 

((n,-'"i)W("2-'»2»N ) - 

where ((•)) is the modulus operator, ,and D(m„ni 2 ) is the transformed shift 

operator. Equation 2.7 indicates that a circular shift in the spatial domain results in a 
phase shift in the spatial frequency domain. The original image frame, is 

assumed to be a periodic two-dimensional sequence with the fundamental period equal 
to the frame dimensions. The object moving across the image frame then describes a 
circular shift [Ref. 2:pp. 61-62]. This implies that, as the object moves off one edge of 
the screen, it reenters the screen from the opposite side. Real image frames, such as a 
radar screen or video display, do not require this property because the moving object’s 
size will likely be much smaller than the screen dimensions. 

The derivation of D(m„mJ begins by introducing a circular shift into the 
original image frame, Equation 2.6 now becomes. 
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( 2 . 8 ) 
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where F’(k„k 2 ) is the circular- shifted, transformed image frame. The amount of pixel 
movement is described by m, for the horizontal shift and Wj for the vertical shift. Letting 
l,=n,-m, and I 2 =n 2 -nt 2 Equation 2.8 becomes, 



where FOc^k^) is the transformed image frame without the circular shift. This proves that, 
when an object moves across an image frame in the spatial domain, the motion is 
described by a phase shift in the spatial frequency domain, thereby agreeing with the 
circular shift property (Equation 2.8). 

3. The Moving Image Model In The Spatial Frequency Domain 



the spatial frequency domain. The transformation is accomplished by taking the discrete 
Fourier transform of the image portion of these equations which gives. 




N * N 



(2.9) 



m, 



Rearranging terms. Equation 2.9 reduces to. 



F%Jc^) = e Fik^Jc^), 




( 2 . 10 ) 






The spatial domain model described by Equation 2.3 can be transformed to 
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( 2 . 11 ) 
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X(it+1) = ^(v) X(k) + Z(k) 
= a(X(k);zm 

where, 



F(k) = ^{fik)} G 

z^k) ^^{Cjik)} € C^^ (2.13) 

D(v) = diagonal [ ]. (2-14) 

and a(*,*) is a non-linear operator. The vector, /fX:) and its corresponding plant noise, 
^/k) are transformed to the spatial frequency domain, where is the two-dimensional 
discrete Fourier transform operator (Note that the index k refers to the sample at time 
k). The transformed shift operator, D(v) is a diagonal matrix where the diagonal terms 
Di(v) are the transformed shift operators for specified spatial frequencies so that the 
corresponding elements co, of spatial frequency vector co is defined as. 



0 ),. = 



N 

2n^i 

N 



(2.15) 



The image plant noise remains a zero-mean, white Gaussian random process since it is 
generated by a linear operation; 
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= E[zz^ = 



a}I 0 



(2.16) 



0 oil 

Additionally, note that only half of the spatial frequencies need to be stored in F(k) due 
to the conjugate symmetry property of discrete Fourier transform of a real image [Ref. 
3:p. 45]. 

The corresponding measurement equation in the spatial frequency domain is: 



Y(k) = F(k) + N(k), 

= [I 0] X(k) + N(k), (2.17) 

= C X(k) + N(k), 

where Y(k) = ^y(k)} and N(k) — ^v(k)), and N is z. zero-mean gaussian with known 
covariance matrix, 



R^ = E[mN\k)] =oll. ( 2 . 18 ) 

In practice, the image is measured and then the Fourier transform is performed. But for 
simplicity, we will assume that the Fourier transform of the image is measured. The state 
equations given by Equations 2.11 through 2.18 are the basis for implementing the EKF 
in the spatial frequency domain. The Equations 2.11 through 2.18 define a nonlinear state 
space model of a sequence containing a moving object. The state of this system consist 
of both the image (in the spatial frequency domain) and the velocity of the moving 
object. The model is driven by white noise. Measurements consist of the image in the 
spatial frequency domain corrupted by additive noise. The extended Kalman filter has 
found wide application as a state estimator for systems of the form described above. 
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C. THE EXTENDED KALMAN FH.TER 



1. The Extended Kahnan Filter 

The extended Kalman filter is a nonlinear, recursive filter that can be 
employed to generate estimates of a system whose evolution is governed by a nonlinear 
state equation. The extended Kalman filter equations for the moving image model are: 
State Prediction: 

X(k+l\k) = a(X(k\k),0), (2.19) 

Y(k+l\k) = CX(k-^l\k), (2-20) 

Covariance Prediction: 

P(^+ll^) = A(X(k\k))P(k\k)A\X(k\k)) f Q^, (2.21) 

Kalman Gain: 

G(k^l) = P(k+l\k)C^[CP(k+l\k)C^ + (2.22) 

State Correction: 

X(A:+1 |A:+1) =X(A:+1 1^) + G(/t+l)[F(A:+l) [A:)], (2-23) 

Covariance Correction: 

P(A:+11A:+1) = [7-G(A:+l) C]P(A:+1 |Jt). (2 24) 

The denotes an estimate and the notation k+l\k is read as: at time k+1 given data 
through time k. These equations are a set of recursive equations that can be used for 
estimating the state X(k). 
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2. The Modified Extended Kalman Filter 



As mentioned earlier, the diagonal properties of the transformed shift 
operator, D(v), and the covariance matrices are the basis for the parallel structure shown 
in Figure 2.1. This parallel structure is referred to as the modified extended Kalman 
filter. Reference 1 outlines how the EKF converges to the MEKF as the velocity estimate 
approaches the actual velocity. Practical implementation of the MEKF suggests that each 
filter in the parallel bank be dedicated to a specific spatial frequency. Each individual 
filter is referred to as a single frequency extended Kalman filter (SFEKF). The state 
vector for a specific spatial frequency is defined as. 





X.j(k) 




Re(F,(k)) 


X,(k) = 


X,,(k) 


= 






X^(k) 




T 

CO,V 



(2.25) 



where the first two states, Xi(k) and X 2 (k) are the real and imaginary parts of Fourier 
coefficients associated with a specific spatial frequency and the third state, X^fk), is the 
product of the spatial frequency vector and the velocity vector. Xijfk) is called the 
velocity-frequency product for each frequency. Equation 2.11 becomes. 



Re(F,(k^l)) 




T • T 

cos((o,v) sm(co,v) 0 


Re(F,ik)) 


Im(F,(k*D) 

T 


= 


T T 

-sin((i),v) cos(to,v) 0 


Im(F,(k)) 

T 


(*)jV 




0 0 1 


COjV 



where v contains the two velocity components as. 
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V 



(2.27) 



The non-linear state matrix of Equation 2.26 can be linearized around the current 
estimate of the state by using the Jacobian operator, 



Aim\k)) = 



da 

dX 



The linearized state equation for a single frequency is, 



X.(^+l) = A.(X,(k)) X,(k) Ar 



where. 



(2.28) 



(2.29) 



^A(^)) = 

cos(4(/:)) sin(4(/:)) [-4(fc)sin(4(^))H-4(A)cos(4(^))]| (2.30) 

-sin(4(fc)) cos(4(^)) [-X,,(A:)cos(4(^))-4(^)sin(4(A0)] • 

0 0 1 

The single frequency model (Equation 2.29), the measurement equation (Equation 2.20), 
the linearized state matrix (Equation 2.30) combine with Kalman filter equations to fully 
specify the single frequency EKF’s. Application of these equations yields estimates of the 
real and imaginary components of the Fourier coefficients and the frequency-velocity 
products. 
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D. VELOCITY ESTIMATION 

The single frequency EKF’s each yield an estimated frequency-velocity product. 
These estimates can be combined to yield an estimate of the velocity of the moving 
object. This final estimate is computed by using a weighted least-squares algorithm. 
Unfortunately this estimation algorithm is complicated by an ambiguity of multiplies of 
2x in the frequency-velocity product estimates. This ambiguity is distinct from the 
problem of phase unwrapping since the single frequency EKF’s are free to converge to 
any of the possible multiplies and do not have the rigid structure to the phase unwrapped 
signal [Ref. 1]. By including these ambiguity terms the frequency-velocity estimates can 
be modelled by, 

X^(k\k) = flv + H’+27tm, (2.31) 

where, 

^ (2-32) 

N 






(2.33) 



and 






(2.34) 



The term w is a zero-mean, Gaussian random vector with the estimated covariance. 



E = £[H'W^ = Jiflg[Pj33,P233, -^^ 233 ] 



(2.35) 



where is the 3,3 component of the ith estimation error covariance matrix computed 
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by each of the SFEKF’s. As stated before, the frequency-velocity product enters the state 
equation only as a phase term and is ambiguous with respect to a shift of a multiple of 
2ir. The term m is an unknown integer vector which models this ambiguity. 

The estimates from Equation 2.31 can be used to estimate the velocity by finding 
the real vector v and integer vector m, that minimizes the weighted sum of the squared 
errors, 

J=[X^(k\k)-Clv-2nm)^ -Qv-lirw). (2-36) 

The vectors v and m that minimize Equation 2.36 can be found by performing a search 
over all possible values of m. For each value of m, finding the value of v that minimizes 
Equation 2.36 is equivalent to the classical Weighted Least Squares problem. A new data 
vector may be defined as, 

X = Xj (k\k) - iTzm, (2.37) 

and the sum of the squared error is, 

J = (X-Qv/S-^^-fiv). (2.38) 

Constraining the velocity estimate to be a linear function of the data yields, 

V = G^X^(k\k), (2.39) 

where, 

(2.40) 

Substituting Equation 2.39 into Equation 2.36 yields the sum of the error as a function 
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of m, 

J{m) = (X(jk|Jt)-27tm)^S-‘(/-OG^(;f3(jb|;b)-27r»i). (2.41) 

Using Equation 2.41, m can be estimated using a search of all possible vectors m to find 
the one that minimizes J(m). 

A simplification to this search for the optimal vector m is developed by Burl [Ref. 
l:p. 12]. The values of the m's will be zero for low spatial frequencies. So by using this 
subset of frequency components, the linear estimator of Equation 2.39 yields This 
velocity estimate is used to estimate the m,'s that do not meet the condition m,=0 by, 

m, = Round(^[X^(k\k)-(^Jv^), (2.42) 

2tz 

where Round(*) equals the integer nearest to •. This estimate, in turn, is used to estimate 
the velocity using Equation 2.37 and 2.39. The estimated values of m are included in the 
frequency-velocity product estimates that are fed back to single frequency EKF’s. 
Therefore, once the filter has converged the values of m will all be zero. 

E. IMAGES WITH NONZERO BACKGROUND 

The background in most image sequences will be non-zero. A method of modifying 
the MEKF for use in this case is proposed by Burl [Ref. 1]. This method proposes that 
2 more states are added to the single frequency EKF’s. The additional states are defined 
as the real and imaginary portions of the background in the spatial frequency domain 
and, 
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x,m = 









X,,{k) 




Im(F.(k)) 


X^ik) 


= 


ReiF^ik)) 


X,/k) 




IrniF^m 


x^(k) 




<^fv(k) 



(2.43) 



where F^(k) is the Fourier transform of the stationary background. The state transition 
matrix for these new states are simply the identity matrix since the background is 
considered to be constant. The state matrix of Equation 2.30 then becomes, 



= 

cos(Xjj(fc)) sinCX-^C/:)) 0 0 

-siniX^m cosiX^m 0 0 
0 0 10 
0 0 0 1 
0 0 0 0 



[-X./fc)sin(4(fc))+4(fc)cos(4(fc))] 
[-X,j(k)cos(X^m -X^(k)srn(X^m] 
0 
0 
1 



(2.44) 



In this case the measurement is modeled as the sum of the moving object state, the 
stationary background state, and the additive noise. 
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Figure 2.1 The Modified Extended Kalman Filter. 
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m. SPATIO - TEMPORAL GRADIENT APPROACH 



A. THE IMAGE FLOW CONSTRAINT EQUATION 

1. The Image Flow Constraint Equation 

This method for computing the velocity field (optical flow) employs the first- 
order spatial and temporal differentials of a time varying image to estimate the 
component of motion at each point. We can derive an algorithm that relates the change 
in image brightness at a point to the motion of the brightness pattern. Let the image 
brightness at a point fx,y) in the image plane at time t be denoted by E(x,y,t). Now 
consider that the pattern moves. The brightness of a particular point in the pattern is 
constant, so that, 

— = 0. (3.1) 

dt 

Using the chain rule for differentiation we see that, 

^ _ Q (3 2) 

dx dt By dt dt 

(A more detailed derivation with Taylor series expansion is included in Appendix A). 
This equation can simply be written as, 

u + V + = 0, (3.3) 

where. 
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u = 



dx 

dt' 



dy 

dt 



(3.4) 



- BE - BE - BE 

9 9 * 

* Bx ^ By * Bt 



The partial derivatives E„ E^, E, are estimated from the image intensity values and will 
be formulated later. The constraint given by Equation 3.2 on the local flow velocity is 
illustrated in Figure 1 [Ref. 3:pp. 20-46]. 




Figure 3.1. The locus of points satisfied by the image flow constraint equation defines 
a line in the velocity space. 
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This constraint is often called as the image flow constraint equation since it expresses a 
constraint on the components u and v of the image flow. As seen in Figure 3.1, values 
of (u,v) satisfying the constraint equation lie on a straight line in velocity space. Two 
conditions are required to ensure the validity of the image flow constraint equation; (1) 
the change in image irradiance at each point must be entirely due to the motion of the 
image pattern, as opposed to changes in the pattern due to reflectance effects, and (2) the 
image should be smooth except at a finite number of discontinuities [Ref. 4:pp. 185-189]. 
Surface based smoothing may be used to improve the velocity estimates for images with 
discontinuities. These conditions are probably too restrictive to allow image flow to be 
useful, but the image flow equation is obeyed in practice with sufficient accuracy for 
tasks such as segmentation. 

The image flow can not be computed at a point in the image independently 
of neighboring points without introducing additional constraints. This limitation occurs 
because the velocity field at each point has two components while the change in image 
irradiance at a point due to motion yields only one constraint. Estimation approaches for 
this problem, in one manner or another introduce additional constraints that allow 
mathematical solutions. One solution to this dilemma is to assume that in the local image 
region there are two values of E(x,y,t) that have the same image motion, i.e. adjacent 
pixels exhibit similar image motion. Defining these two points as E, = E(x,,y„t) and 
E 2 = E(X 2 ,y 2 ,t)y we can write the set of two equations. 
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a£, 

1^' 



u + 



3£, 

dy 



V + 



(3.5) 



or the matrix equation, 



3£i 

"aT 



a£j 

"iT 



■« + 



a£j 



•V + 



a£j 



= 0 , 
= 0 , 



= (3.6) 

where u = [u vf, and the matrix S is formed with the appropriate values from the 
equations above. Inverting 5 yields [u vf [Ref. 5:pp. 230-238]. However, any errors in 
the spatial and temporal derivative estimates may have severe effect on the velocity 
estimates. Furthermore there is no guarantee that S is invertible. The local motion 
homogeneity, although perhaps valid for image points corresponding to object points that 
are close together on the moving object, is not a valid assumption when adjacent image 
points correspond to the motion of independently moving and occluding objects. 

An alternative approach is to explicitly formulate a local smoothness 
constraint on the velocity vector, that is to assume that objects of small size are 
undergoing rigid motion and deformation. In this case neighboring points on the objects 
have similar velocities and the velocity field of the brightness patterns in the image varies 
smoothly almost everywhere. Discontinuities in flow can be expected at the object 
boundaries and also where one object occludes another. This may be formulated 
mathematically as the additional constraint which is to minimize the sum of the squares 
of the Laplacians of the x and y components of the flow. The Laplacians of u and v are 
defined as. 
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(3.7) 



and W = 

dx^ dy^ dx^ dy^ 

Now we can introduce the optimization problem. The optimization measure consist of 
two terms: a penalty on the deviation of estimated velocity field from the image flow 
constraint equation and a penalty on the departure of velocity components from 
smoothness (the smoothness penalty was the sum of the squares of the magnitude of the 
gradient of the image flow velocity). These constraints may be written as, 



Minimize, 
subject to. 



= 



ci^u ^ 

• + + +- 



dx^ dy^ 



(3.8) 



u + Ey V + = 0 



The total error to be minimized becomes, 



Cj. = Cj + X t\. (3-9) 

2. Estimating the Partial Derivatives and the Laplacian of Flow Velocities 
We must estimate the derivatives of the brightness from the discrete set of 
image brightness measurements available. The consistency of partial derivatives are very 
important. That is they should refer to the same point in the image at the same time. 
Although there are many formulas for approximate differentiation, an optimum method 
given by Schunk [Ref. 4:p. 190] can be used, which gives an estimate of E^,Ey,E, at a 
point in the center of a cube formed by eight measurements. The relationship in space 
and time between these measurements is shown in Figure 3.2. Each of these estimates 
is the average of four first differences taken over adjacent measurements in the cube. 
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Figure 3.2. The three partial derivatives of image brightness are estimated from the 
average of first differences along four parallel edges of the cube. 



E = — {F. . , ,-E- t+F. , . , .-E. , .. 

~ (3.10) 

E, = — (F. . . , -F. • . +F. ■ , . , -F. . , . 



We also need to approximate Laplacians of u and v. One convenient 
approximation takes the following form, 
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« (u - u) and = (v - v), 



(3.11) 



where the local averages u and v are defined as follows, 

(3.12) 



1/12 


1/6 


1/12 


1/6 


-1 


1/6 


1/12 


1/6 


1/12 



Figure 3.3 The Laplacian is estimated from a weighted average of the values at 
neighboring points. 
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3. Minimization 



The minimization of total error e^is a constrained minimization process which 
has been formulated by Lagrange multipliers in Equation 3.9 (there are several other 
ways). The solution to the Lagrange formulation is given by finding values of « = [u,vf 
that satisfy, 




— = 0 . 

dX 



(3.13) 



Using the calculus of variations [Ref. 6:pp. 107-178] yields the solution as, 



e] u + E^EyV = X^u - 
E^EyU ^ e]v = X^v - Ey E, 



(3.14) 



Using the approximation to the Laplacian introduced previously gives the folowing 
equations. 



{X + E^)u ^ E^E^v = (Xu - E^ E), 
(E^ E^u ^ (X ^ e])v = (Xv - E^ E). 

Solving for u and v gives. 



u =u - (eIU ^ E^E^v * E^ E,)I(X + eI -h E^) 
V = v - (E^E^u * E^v ^ Ey E)KX + e] + e]) 



(3.15) 



(3.16) 



These results yield two solutions for X such that. 
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X = -[E^ v-h£, £,]/(« - u), 

= -[E^ E^ u^e] v^Ey £,]/(v - 



( 3 . 17 ) 



Notice that these equations constrain u and v but return no new information. They also 
yield, 



62 = 0. 

Now we have a set of two equations for each point in the image. It would be 
very costly to solve these equations simultaneously (also with the Lagrange multipliers). 
Instead an iterative method can be developed. We can compute a new set of velocity 
estimates from the estimated derivatives and the average of previous velocity 

estimates (Wy) by, 



= « " - £,[£, M" +Ey + £,]/(X + eI + Ey), 
= V" _ EIE^ « " -H E v» + £,]/(X + El + El). 



.n^l 



( 3 . 18 ) 



The algorithm stops when convergence is obtained: that is when, 

u\x,y,t) - u’'-\x,y,t) ^ c,, 
v"(x,y,r) - v"-‘(x,y,r) ^ e,- 



( 3 . 19 ) 



In parts of the image where the brightness gradient is zero, the velocity 
estimates will simply be the averages of the neighboring velocity estimates. There is no 
local information to constraint the apparent velocity of motion of the brightness pattern 
in these areas. Eventually the values in these areas will propagate inward. If the 
velocities on the border of a region are all equal to the same value, then points in the 
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region will be assigned that value too (after a sufficient number of iterations). If the 
values on the border are not the same, it is a little more difficult to predict what will 
happen. In all cases the values filled in will correspond to the solution of the Laplace 
equation for a given boundary condition. 

The best result in most cases is provided if the algorithm is performed a small 
number of times per frame of a test sequence. In the first few frames an image flow 
velocity will be estimated within the boundaries of moving object. The velocity field 
outside the object area will be zero (or very small). As the object moves, the region of 
nonzero velocity field does not move along with the moving object, since there is nothing 
in the algorithm that forces the velocity field to be extrapolated in the direction of 
motion. Since the location of the region of nonzero velocity field is not forced to move 
along with the object, the translating object moves out from under the region of nonzero 
velocity vectors. New nonzero velocity vectors are formed at the new position of the 
object. The algorithm does not eliminate the old region of velocity that trails behind the 
moving object. This creates a 'comet tail’ behind the object. In regions where the 
gradient is zero, the algorithm degenerates into the Laplace equation which smooths the 
velocity field. The region of nonzero velocity vectors that trails behind the moving object 
is also smoothed into a consistent (though obviously wrong) velocity field which persist 
long after the object moves beyond the boundaries of the image. As mentioned before, 
the algorithm cannot work in cases where there are motion boundaries in the velocity 
field, or more than one moving object exist int he scene, since the smoothness constraint 
leads to iterative equations that blur abrupt changes in the velocity field. 
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One may use a smoothing algorithm for improving the velocity field 
estimation. The smoothing algorithm constructs a smooth estimate of the velocity field 
by approximating a surface between step discontinuities. Another algorithm called 
Constraint Line Clustering that utilizes the polar form of the image flow constraint 
equation to estimate the image flow velocity field when there are discontinuities is 
described by Schunk [Ref. 7:pp. 1010-1027]. 



B. CONSTRAINT LINE CLUSTERING 

1. The Polar Form of the Constraint Equation 

The image flow constraint equation defines a line in velocity space as shown 
in Figure 3.1. The line in velocity space that is the locus of possible velocities specified 
by the image flow constraint equation is called the constraint line. The constraint line 
may be uniquely defined by d, the distance of the line from the origin along the 
perpendicular bisector and the angle a with respect to the u axis. The displacement and 
the angle of the constraint line are given by, 



d = 




(3.20) 



a = 



_ I arctan(£j^,£y) if ^ 0; 
1 arctanC-E^j-Ep otherwise. 



( 3 . 21 ) 



To derive the constraint equation in polar coordinates, we first need to note that the 
image flow equation contains the dot product of the gradient of the image irradiance with 
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the velocity vector, 



E^u + EyV E, = VE . (u, v) + (3.22) 

If we define p as the speed of motion and /3 as the direction of motion contained in the 
constraint equation, then the dot product in Equation 3.22 becomes, 

p |V£| cos( a - p ). (3-23) 

Dividing through by the magnitude of the gradient yields the polar form of the image 
flow constraint equation, 

d = p cos( a - p ). (3-24) 

Note that ^ is constrained to be between a-x/2 and a + x/2. Since the displacement d of 
the constraint line from the origin must be nonnegative, the orientation a is reflected 
when E, > 0. The displacement d is the projection of the motion vector onto the line of 
the gradient of image irradiance and is independent of the magnitude or polarity of the 
gradient. 

The polar form of the image flow constraint equation can be used in the 
analysis of algorithms for image flow estimation with motion boundaries since the polar 
form will not contain 5-functions at step discontinuities. The polar form will not contain 
5-functions since the displacement d is the ratio of the magnitude of the change in time 
of the image irradiance to the magnitude of the gradient of image irradiance and this ratio 
remains finite in the limit as the image irradiance becomes an ideal step discontinuity. 

The constraints are sufficient to allow the sampled image sequence to 
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accurately portray the local motion data between motion boundaries, but are not sufficient 
to capture the motion information at motion boundaries. The motion constraints are 
extremely sensitive to the effects of occlusion boundaries and are usually in severe error. 
No practical sampling period can provide useful motion information along a motion 
boundary. An image flow estimation algorithm must be able to work in situations where 
there are motion boundaries in the image. So the algorithm called constraint line 
clustering [Ref. 7:pp. 1010-1027] that estimates the image flow field when the image 
irradiance pattern or the velocity field contains discontinuities is presented in the next 
section. 

2. Constraint Line Clustering 

The constraint line clustering algorithm uses the polar form of the image flow 
constraint equation given by, 

d = p cos( a - p ), (3-24) 

where p(x,y) and ^(x,y) are the speed and direction of motion, respectively. The velocity 
vector (p,^) for the motion at any point in the image must lie along the line in velocity 
space defined by the image flow constraint equation shown in Figure 3.1. The constraint 
line is uniquely defined by the displacement d of the constraint line from the origin and 
the orientation a. of the constraint line. The constraint line displacement has the 
dimensions of speed; it is the minimum speed consistent with the motion constraint. 

Assume that the image intensity E(x,y,t) already incorporates any spatial or 
temporal filtering performed on the image. For example, the image could be smoothed 
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Figure 3.4 The position of the intersection of two constraint lines along one of the 
constraint lines. 

with a Gaussian filter. The image is sampled in space and time to produce an image 
sequence E(Xi,yj,t^). The d and ot intrinsic image arrays d(Xi,yJ and afXijj) are computed 
from the image sequence using the formulas in Equations 3.20 and 3.21. The partial 
derivatives are computed by the differences over a cube in space and time by equations 
given by Equation 3.10. The image flow estimation problem is to compute the intrinsic 
image arrays p(x„yj) and P(x„yj) for the speed and direction of motion from the motion 
measurements in the d and a arrays. The constraint line clustering algorithm uses a 
cluster analysis in one dimension to extract the motion estimate from contradictory data. 
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Suppose that for each d and a measurement, a set of measurements { d„ a, } is taken 
from some spatial neighborhood of the point that generated d and a. Compute the set of 
intersections of each of the neighboring measurements d, and a, with the given d and a 
measurement. All of these intersections lie along the line defined by d and a. Any 
constraint lines in { d„ a, } that are part of the same region of motion as d and a will 
tend to intersect the line defined by d and a in a tight cluster around the true velocity. 
Any constraint lines that are from different regions of different motion will intersect the 
line defined by d and a over a broad range of positions. The incorrect constraint lines 
generated along a motion boundary will not intersect the line defined by d and a at a 
consistent point. 

The formula for the position of intersection along a constraint line can be 
easily derived by first rotating the coordinate system so that the constraint line defined 
by d and a is parallel to the vertical axis in velocity space. Let the angle between the two 
constraint lines be denoted by <f> = a’-a as shown in Figure 3.4. The distance of the 
intersection along the constraint line may be computed by, 

b = — (3.25) 
tan(4>) 

The distance c is related to d, d’ and <i> by, 

( d + c ) cos(4>) = d'. (3.26) 

Using Equation 3.25 to eliminate c from Equation 3.26 yields. 
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d cos(<J)) + b sin (4>) = d'. 



( 3 . 27 ) 



Solving for b and substituting using the definition for <j> yields, 

^ _ d'-d cos(4>) _ d'-d cos(g^-g) ^3 2g) 

sin(4>) sin(a^-a) 

This formula provides the position along a constraint line defined by d and a of the 
intersection of the constraint line with another constraint line defined by d’ and a’. So 
given a set { b, } of intersections of the constraint lines within a neighborhood with a 
constraint line at the center of the neighborhood, the set of intersections may be analyzed 
to determine the most consistent subset of intersections that cluster about the likely 
velocity. The cluster analysis criterion is motivated and explained by imagining a motion 
boundary passing almost vertically through a neighborhood just to the left of a center 
element. At any reasonable pixel resolution, the boundary will most likely pass smoothly 
through the neighborhood dividing the neighborhood almost in half. The region on one 
side corresponds to one surface in the scene and the other side including the center 
element corresponds to a different surface. In the extreme case where the boundary 
passes very close to the center of the neighborhood, almost half of the intersections of 
neighborhood constraint lines with the constraint line from the center of the neighborhood 
will be with constraints from the other side of the surface or the motion boundary itself. 
These intersections will most likely not be close to the velocity of the surface from which 
the center constraint line was obtained. Almost half of the intersections may be useless 
in the most extreme case; but at least half of the intersections will correspond to the same 
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region of motion. Since rejecting correct intersections is less harmful than accepting 
incorrect intersections, a conservative algorithm can be taken: the algorithm looks for the 
tightest cluster that contains roughly half of the intersections. The cluster analysis is done 
in one-dimension which makes it easy. The algorithm sorts the set of n intersections {bj 
and then looks for the tightest interval that contains half of the intersections by examining 
successive pair of intersections that are [n/2] intersections apart. The algorithm chooses 
the pair of intersections that are closest together as the estimate of the majority cluster 
of intersections. Then the center position b' of the tightest cluster along the constraint 
line is the midpoint of the smallest interval that contains roughly half of the intersections. 
So the estimated speed of motion is given by 

^ . V (3.29) 

and the estimated direction of motion is given by, 

P = a + tan’*(fc / d). (3.30) 

The estimated speed and direction of motion (v, |8) computed with constraint line 
clustering avoids the error problem in motion boundaries and gives another solution to 
the image flow constraint equation. Also the constraint line clustering algorithm is very 
robust and works even if the basic assumption that at least half of the intersections must 
be from the same region of motion is violated. For example if the center constraint is 
from just inside a right angle comer so that only 1/4 of the intersections are from the 
same region of motion, the algorithm still works because the intersections from 
constraints outside the comer are almost uniformly distributed along the constraint line 
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and do not bias the estimate of the tightest cluster. 

The constraint line intersections are not always numerically well conditioned. 
When two constraint lines are close in orientation there can be a large error in the 
intersection position. When the constraint lines have the same orientation and different 
displacements, the lines will not intersect. In spite of these objections, the constraint line 
clustering algorithm is robust by the fact that only the tightest group of points are 
retained to compute the motion estimate. Nearly half of the intersection positions will be 
discarded. Any grossly incorrect intersections will most probably be discarded since they 
lie outside the tightest cluster. 

The algorithm is also very insensitive to the neighborhood size. Good results 
may be obtained even for the minimum neighborhood size of 3 by 3. Larger 
neighborhood sizes produce only slightly better motion estimates. There are two reasons 
for this. The first reason is that the accuracy of the motion estimate is limited by the 
accuracy of the constraint line along which the motion estimate is obtained. The second 
reason is that the number of points in the neighborhood may have little impact on the 
quality of the motion estimate; it is the spatial extent of the neighborhood relative to the 
gradient variation that determines the quality of the motion estimate. In order to get good 
intersection measures, motion constraints must be used from beyond the region of similar 
gradient. The motion estimate can be greatly improved by increasing the spatial coverage 
of the neighborhood. One apparent weakness with the constraint line clustering is that it 
depends on the accuracy of the constraint line at the center of the neighborhood. This is 
not a problem with the constraint line clustering algorithm itself but is an instance of a 
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more fundamental problem with assigning a motion estimate to a point in an image. If 
the constraint equation is grossly incorrect, then there is a fundamental ambiguity in 
assigning a motion estimate to the corresponding image location. 

The information in image flow constraint equations could be combined by 
least squares methods. The image flow constraint equation is one constraint with two 
unknowns. The equations in a neighborhood could be solved by least squares methods 
for an initial motion estimate. 
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rv. ONE^DIMENSIONAL FOURIER TRANSFORM FOR TRACKING MOVING 



OBJECTS 

A. INTRODUCTION 

The method presented in this chapter brings a solution to the problem of 
determining motion estimates via a One-Dimensional Fourier transform formulation. The 
technique identifies objects uniquely by their motion using an algorithm which is based 
on the application of the one-dimensional Fourier transform [Ref. 8;pp. 383-388]. 

This method not only separates the time-varying and time invariant parts of an 
image but perform these operations with the least amount of computation. The method 
presented here associates sinusoids with the time-varying parts of the image, such that 
the faster moving objects are associated with high frequency sinusoids and the time- 
invariant parts are associated with constant levels. 

B. THE COSINE-AREA TRANSFORM 

Let a one-dimensional discrete sequence be described by f(x,t), t=0,l,...,T-l of T 
frames of size M. Define a general transformation as, 

8(t) = E MO r=0.l,...,7’-l, (4.1) 

x-O 

where w(x \ itjis a weighing function of order k. A properly chosen weighing function 
will allow us to use this weighted area, g(t), to find the velocity of a moving object while 
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discriminating against a stationary background. 

If w(x I k) = cos(2irkx/M) is chosen as a weighting function, Equation 4.1 
becomes, 

S(0 = S cos(2nJbclM) 

x=0 

So if the signal is multiplied point for point with the amplitude of a sinusoid and 
the result is summed (integrated) over all the spatial variables the result is a single 

value gfii) at each point t,. When the signal is constant with respect to time, gftj will have 
a constant value, i.e. g(tj = A. When the time sequence contains a moving object, gftj 
will trace out a sinusoid [Ref. 9:pp. 281-283]. 

Suppose that for frame one (t=0) the sequence yields a one-dimensional array with 
M entries that are zero except at one location x„ (where the object is). So the weighted 
sum gives gC0J=cosf2Tkx„/MJ. The movement of the object by one location in the next 
frame (t = 1) gives g(l)=cos(2'Kk(x„+l)/M). If the object continues to move one location 
per frame, g(t) yields a sinusoid with frequency of k/M. If the object were moving y, 
locations between frames then the sinusoid would have a frequency vJc/M. Since t varies 
between 0 and T-1 in integer increments, if we restrict k to have integer values then the 
discrete Fourier transform of the sinusoid would have 2 peaks, one located at frequency 
vjc and the other at T-vJc. This latter peak is due to the symmetry foldover of the Fourier 
transform. Thus a peak search in the Fourier spectrum would yield vJc and division of 
this quantity by k yields 

The factor k in the weighting function has several properties and limitations. 
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Fundamentally, 



0 < \k\ < M ! 1, (4.3) 

which is basically determined by Nyquist sampling criteria to prevent aliasing and for 
convenience k is usually a positive integer. A zero k value allows no sinusoid to be 
created by a moving object and a k greater than M/2 produces an aliased spectral 
response. The actual value of k is chosen from the frame rate and the maximum expected 
object velocity, 

where is the frequency limitation given by the available number of frames and the 
Nyquist criteria. The maximum and minimum velocities represent the dynamic range of 
the system, i.e., 

/ ^'min = ^ / 2- (4-5) 

The Fourier transform of g(t) provides the basis for an efficient and effective 
method for estimating the angular frequencies of sinusoids. The Fourier transform of g(t) 
is given by, 

r-i 

Gif) = E sit) /=0,l,...,r-l 

(4.6) 

7-1 M-1 

= E E fi/tf) cosi'^T^kxIM) 

t=0 jr»0 

The result is that the moving object generates a frequency proportional to its velocity. 



40 



All the stationary objects and the background will appear as a zero frequency component 
in this transform domain. The frequencies generated by the moving objects can be easily 
discerned by a simple peak detection algorithm. 

This method is easily generalized for use on two-dimensional spatial images. Let 
the time sequence of images be represented by f(x,y,t) with 0<x<N, 0<y<N, 
0 < t < N. The resulting velocity is a vector v = [ v^v^f. The cosine area transform 
is defined by, 



Mi 



N-i r 

= cos 

jt=0 

y=0 



{inkflN} 



(4.7) 



and to extract velocity components, the following Fourier transformations are taken. 



kwj 










(4.8) 



C. THE GENERALIZED AREA-TRANSFORM 

We will generalize the cosine area transform in a way that will yield us the 
extraction of both velocity components simultaneously and identification of the moving 
objects in the original sequence. Two Fourier transformations are the foundation for this 
generalization. These transformations result from a replacement of the cos(2'Kkx/N) term 
with expi-jlrkx/N). The first transform is a space to inverse-space transform whose 
values are subsequently summed over the second spatial variable; while the second is a 
time to frequency transform [Ref. 9:p. 284]. The transformations for a sequence of T 
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digital images of size N x N, at any integer instant of time are given by, 



N-l N-l 






^(M = E E ^ ^,=o,±i,. 

jc=0 y=0 



(4.9) 



W-I N-l 

^(V) = EEMy.o« ^ ky=o,±h.. 

y*0 jr*0 



(4.10) 



r-i 



-ftnft 



= E ^(M ^ 

t=0 



(4.11) 



T-l 

Gik^S = E ^(^yO « '' /=0,l.....r-l (4.12) 

«=o 

It should be noted that separate transforms are used for each spatial dimension, and the 
first two transforms are summed over both spatial variables. This is done so that the 
velocity components can be resolved for each dimension. The velocity product is found 
by detecting a peak in the spectrum of G(k^J) for a fixed value of K. The velocity of an 
object in the dimension u is then found by dividing the frequency of the peak by k,. Also 
it is of interest to note that a sequence of frames in which no motion takes place would 
yield identical exponential terms whose Fourier transform would consist of a single peak 
at a frequency of 0. Since the operations are linear, the general case involving one or 
more moving objects in an arbitrary static background would have a Fourier transform 
with a peak at 0 (corresponding to the static image components) and peaks at locations 
proportional to the velocities of the objects. 
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POWER SPECTRUM WITH OFT 




FREQUENCY 



Figure 4.1 The ideal power spectrum (G(k„,f)) with DFT. 



The determination of a velocity component requires the use of phase information 
derived from the real and imaginary parts of F(k^J). Since the speed of object was 
directly obtained from the spectrum of G(k^J), the phase information in F(k^,t) is used 
to derive the sign of the velocity. Once a moving object is found, comparing the signs 
of the equations below determines the sign of the velocity; 






d^Re{F{K^t)} 



dt^ 



t 






(4.13) 



^2 



d^Im{Fik^,t)] 



(4.14) 



Since F(k„t) is a sinusoidal, S, and will have the same sign at an arbitrary point in 
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time t, if the velocity has a positive sign. The velocity will be negative if 5, and S 2 have 
opposite signs. It should be noted that the ambiguity resulting when S, or S 2 equals zero 
is avoided by using the closest point in time t = t, ± At. 

The algorithm proposes that the 1-D FFT (DFT) spectrum of F(k^,t) provided by 
Equation 4.11, 4.12 will have a peak proportional to the velocity components of the 
moving object as shown in Figure 4.1. In reality the algorithm yields a wide spectrum. 
This spectrum has "smeared" frequency peaks instead of one frequency peak at the 
desired location as shown in Figure 4.2. 



POWER SPECTRUM WITH OFT 




FREQUENCY 



Figure 4.2 The "smeared" power spectrum (G(k„,f)) with DFT. 

There are mainly 2 reasons for this undesirable result. 

(i) The undesirable signal components in the image frames (noise, occluding boundary 
effects, rotation of object, non-stationary background etc.). These "noise effects" can 
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cause spectral spreading. 

(ii) The loss of frequency resolution due to limited number of frames. 

The first condition has already been mentioned where the model did not include the 
non-stationary and non-uniform background. The effect of noise is always a problem for 
any algorithm. 

The second condition is caused by the situations where the equation = # 

frames/2 is not satisfied due to the lack of available frames. Note that if is taken to 
be 1 pixel per unit time, is limited by the number of frames available. This 
limitation may be overcome by assigning the appropriate value to k^. For the case where 
image size is an integer multiple of the available number of frames, the maximum peak 
occurs at the frequencies integer divident of actual velocity. For other cases, the single 
frequency component at kj4 will be "smeared" out to other frequencies. This smearing 
effect is defined as "leakage" [Ref. 10:pp. 459-474] and may cause errors in velocity 
estimates which also add to the errors due to the noise effects. Also as the leakage causes 
significant nonzero frequency components at the undesired frequencies (starting from 
adjacent frequencies), it also reduces the magnitude of the peak at the desired (or 
accurate) frequency. 

A precise analysis is beyond the scope of the discussion here, but an intuitive idea 
of what is causing leakage may be obtained by looking at the time domain representation 
of sinusoids. The primary source of leakage seen in the FFT (DFT) of these sinusoids 
are the discontinuity introduced in the periodic extensions of complex sinusoid sequences 
by the missing partial cycles in available data. Missing partial cycles of the periodic 
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sinusoids occurs when the image size is not an integer multiple of the available number 
of frames. Note that the best result occur in cases where the number of frames is equal 
to the image size. 

One solution to the problems introduced above is to use Spectral Estimation 
techniques as a last step to extract the frequency information from the available corrupted 
data. Besides the 1-D FFT (DFT), the Maximum Entropy Method can be used to 
improve estimates of peak frequencies. The Maximum Entropy Method (MEM) was 
originally devised by Burg [Ref. 11] to overcome the fundemental limitations of 
Fourier-based methods for estimating the power spectrum of complex sinusoids in the 
presence of additive noise. 
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Figure 4.3 The improved power spectrum with the MEM method. 
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Other spectral estimation techniques such as the Multiple Signal Characterization 
(MUSIC) Algorithm, the Maximum Likelihood Method (MEM) and the Auto Regressive 
(AR) Signal Modelling were considered. The MEM was selected since it has significantly 
better resolution characteristics than the others when not much data is available [Ref. 
12:pp. 385-392]. In particular, the MEM avoids the use of periodic extension of data or 
of the assumption that the data outside the available record length is zero. 
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V. THE 3-D FOURIER DOMAIN ANALYSIS OF IMAGE MOTION 



A. THE SPATIOTEMPORAL FREQUENCY METHOD (STF) 

This approach to image flow derivation takes the advantage of the special form of 
the Fourier transform of time varying images characterizing constant-velocity, rigid-body 
translation. The method is based on the spatio temporal-frequency image representation 
and is called STF (Spatiotemporal-Frequency Approach) [Ref. 13,14]. A time-varying 
image function, moving in a spatial direction determined by the velocity vector [u vf 
may be defined as, 

Ax,y,t) =Ax- ut,y - vt), 

= Ax,y) * Hx - ut,y - vf), 

where f(x,y) = f(x,y,0), 8{x,yJ is the Dirac delta function and denotes convolution. 
The 2-D Fourier transform of Equation 5.1 incorporates a phase term of the form, 

F2-D {^(W)} = 

(5.2) 

where F(w,,wJ is the fourier transform of f(x,y). The existence of this phase component 
in the above equation enables the estimation of image velocity. This phase term is 
linearly dependent on t and the inner product of the 2-D Fourier indices (w, with 
the velocity vector (u v/. Determining the 3-D Fourier transform of Equation 5.1 only 
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requires an extension of the 2-D transform in Equation 5.2. Again assuming constant 
velocity, the 3-D Fourier transform can be denoted by, 






= / ^'2-bWW)) dt, 

— m 

m 

= dt. 



(5.3) 



= F(w^,w^ 6(w^u+WyV+w). 

As shown in Appendix B, this gives a very interesting result that a uniformly translating 
image with velocity [K.vJ^ has a Fourier spectrum that is zero elsewhere in the 3-D 
spatiotemporal space (vy.,Wy,wj except on the single plane defined by, 



{ (w,,Wy,w,) : + v^Wy + w, = 0 }. (5.4) 

In general, for each velocity [Uq,vJ^ Equation 5.4 defines a unique planar locus of 
spatiotemporal frequencies which are consistent with that velocity as shown in Figure 
5.1. Thus F(w^,Wy,wJ may be visualized as a 3-D solid of constant intensities, where the 
slope of this solid surface is given by the values of m<, and Vp. This scheme for flow 
derivation however, does not provide the generation of an optical flow function; rather 
it yields a single velocity characterizing the entire image for all time. To make the 
computation of a time-varying and local image flow function possible, numerous 
regionally computed 3-D Fourier transforms could be employed instead of just one global 
transform. Then a regional velocity could be determined for each of the regional Fourier 
transforms in the manner described above. 
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Figure 5.1 The velocity planes in the STF (spatiotemporal-frequency) Space. 



The 3-D Fourier domain analysis of image motion is useful for several reasons, in 
particular: 

1) It suggests a method for determining motion (velocity) parameters by 
examination of the nonzero region of F(w^,w^,wJ. 

2) It allows the design of a frequency domain filter for spatiotemporal interpolation 
of images [Ref. 4:pp. 61-66]. 
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B. VELOCITY TUNED FILTERS 



The form of velocity-tuned filters is motivated by the frequency-domain description 
of images which represent constant-velocity, rigid-body translation of objects [Ref. 15]. 
Although this is a very restrictive assumption, the results form a good foundation for 
considering more general types of motion. Motivated by the form of F(w„Wy,wJ, which 
defines a plane in 3-dimensional frequency space, a velocity-tuned filter is defined as one 
which passes the Fourier transform of a constant-velocity function with unity gain and 
has zero gain over the rest of the frequency space. The filter will be supported on a 
unique planar surface defined by the velocity components of the image. However, such 
a filter turns out to have limited usefulness and is physically unrealizable. A better and 
more general approach is to use a finite bandwidth filter, one which has approximately 
unity gain in a solid 3-dimensional region surrounding the planar surface of F(w„Wy,wJ 
and goes to zero outside that region. So, we may consider the support of the 
3-dimensional filter to be bounded by two planes which are parallel to the central plane 
and separated in the w, direction by a temporal bandwidth of F/wJ. The velocity-tuned 
filter will also pass constant-velocity functions whose velocity is different than the 
velocity to which the filter is tuned; this requires only that the filter transfer function 
H(w,,Wy,wJ is approximately constant over the support of the input. The range of velocity 
over which this is true gets smaller as the temporal bandwidth of F/wJ decreases and as 
the spatial bandwidth of the input increases. Specifically the velocity range can be 
defined as. 
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and 



(5.5) 



B B 

\ - u' \ < — ^ and I - v* I < — 

■* y 

Here i/,V are the velocity values to which the filter is tuned and i/.v* are the velocity 
values of the input [Ref. 15;p. 64], B, is the temporal bandwidth of F/wJ, and are 
the spatial bandwiths of the original function. The quantities B/W, and B/Wy collectively 
represent the effective velocity bandwidth of the filter when applied to this input. When 
the conditions of Equation 5.5 are not met, the Fourier transform of the input intersects 
the planar boundaries of the bandpass filter, and the frequency components outside the 
passband are attenuated. 
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VI. SIMULATION RESULTS AND CONCLUSIONS 



The algorithms developed in the previous chapters are simulated with artificial and 
real image frames where MATLAB is used as a tool for simulation. The artificial image 
frames were 32x32 pixels in size and contained a square to represent the moving object. 
Sixteen shades of gray levels are used to represent the pixel intensity values of the square 
object and the background, although the intensity values are normalized before applying 
each algorithm. The algorithms are tested under two different background conditions. For 
the first case the image frames are assumed to have uniform background with zero 
intensity and for the second case a random checkerboard background with half the 
intensity of the moving object is used. A random checkerboard background is used in 
order to avoid any regular or periodic pattern in the background. The artificial images 
used in the simulations are shown in Figures 6. 1-6.4. 

Throughout the simulations, an artificial disturbance (noise) is added to the image 
frames (pixel values). The Relative Error in the estimation of velocity components is 
used as a comparison criteria for different SNR and different algorithms. The SNR (in 
decibels, dB) for an NxN image is defined as. 



N-l N-l 



\ 




( 6 . 1 ) 



jt=0 y=0 
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where is the desired component (actual pixel value), and n(x,y) is the undesired 
component (noise). The final pixel values of the image frames are determined by the sum 
of these two terms. The Relative Error is defined as, 

Relative Error = (6.2) 

Il«ll2 

where u is the actual value of velocity component, u is the estimated velocity component 
and II • I 2 defines the Frobenius norm of a vector or matrix. Each algorithm is tested 
4000-8000 times with very slowly changing SNR’s. The resulting values of these tests 
are averaged over larger changes of SNR’s. In each test, the algorithms are iterated 32 
times. Since the image size was 32, the moving object is simulated such that as it 
disappeared from one end of the image frame it is allowed to reappear at the other end. 

These algorithms are also tested with a real image sequence containing a moving 
car. This real image was 128x128 pixels in size and contained 256 shades of gray to 
represent the pixel values. These tests are done just to see if the algorithms would work 
with real images. Also since the EKF and the 3-D FFT algorithm (with Velocity Tuned 
Filter) promises restoration of the image and increase in SNR’s, these two algorithms are 
tested with the real image sequences. 
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Figure 6.1 The test image frame containing a square object without noise. 




Figure 6.2 The test image frame containing a square object with noise. 
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Figure 6.3 The test image frame including background without noise. 




Figure 6.4 The test image frame including background with noise. 
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A. THE EXTENDED KALMAN FH.TER 



The extended Kalman filter is implemented with a 3"* order filter for images with 
zero (or uniform) background images and with a 5* order filter for images with nonzero 
background as was discussed in Chapter II. 

Two sample test results of the EKF algorithm are shown in Figures 6.5-6.6. The 
EKF gives good results even under fairly low SNR conditions. In general at high SNR 
situations the EKF converges to the actual velocity values after 3-4 iterations and as the 
SNR decreases the iterations needed increases up to 25-30 iterations. At very low SNR 
conditions (-30,-40 dB), the EKF estimate of the velocity components is not guaranteed 
and depends on the random disturbance. Figure 6.7 shows the Average Relative Error 
(computed by averaging the relative error over 32 iterations) and the Final Relative Error 
(the relative error in the final velocity estimate) versus the SNR of the input image 
frames. Figure 6.8 shows the Relative Error versus Iterations for some selected values 
of SNR. Figures 6.9-6.10 shows similar results where the image frames included the 
background (5* order EKF). As seen from these two sets of results the background 
effected the performance of the algorithm. There are mainly two reasons for that. First, 
the occluding boundaries create a noise term which is not included in the SNR term. 
Second, the background which has much more signal power than the moving object, 
makes it harder to detect the phase changes due to the moving object. Similar effects are 
seen in other algorithms that use Fourier phase changes as a basis for velocity estimation. 
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Figure 6.5 The EKF velocity estimates for a high SNR case. 
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Figure 6.6 The EKF velocity estimates for a low SNR case. 
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Figure 6.7 The Relative Error vs. SNR with a 3*^ order EKF. 
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Figure 6.8 The Relative Error vs. Iterations for a 3"^ order EKF. 
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Figure 6.9 The Relative Error vs. SNR for a 5* order EKF. 
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Figure 6.10 The Relative Error vs. Iterations for a 5 '^ order EKF. 
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B. THE IMAGE FLOW CONSTRAINT EQUATION METHODS 



The iterative solution given by Equations 3.16 - 3.17 and Constraint Line 
Clustering method are used for the computation of the Image Flow Constraint equation 
in the simulations. The iterative solution method is implemented in two different ways. 
For the first case only two frames are used and the computations are iterated for 1 , 4, 
16, and 32 times. The estimated velocity fields are shown as arrow diagrams in Figure 
6.11. Initially, the velocity estimates are assumed to be zero. Consequently, the first 
iteration shows vectors where the temporal changes occur. Later the estimates approach 
the correct values in most parts of the image. A few changes occur after 16 iterations. 
In the next case, one iteration is used between two consecutive frames (per time step). 
The resulting velocity fields are shown in Figure 6.12 for 1, 4, 16, and 32 time steps. 
The estimates approached the correct values more rapidly and very few changes occur 
after 16 time steps. This leads to the final method where the velocity estimates are 
computed 4 times between two consecutive frames with 16 or 32 time steps. The velocity 
fields after 1, 4, 16, and 32 time steps, where 4 iterations per time step are used, is 
shown in Figure 6.13. The final estimates of the velocity components of the moving 
object are computed simply by averaging the velocity field in pixel locations that have 
significant temporal changes (i.e. where the moving object is). This created a 
thresholding problem to determine the region of motion especially for the high noise 
case. Since both the artificial image and the real image had higher intensity values at the 
object points than the background, the problem is solved rather easily. But as the noise 
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level is increased to higher values, the error in computing the regions of motion 
increased the error in the velocity estimates. Since the new estimates at a particular point 
do not directly depend on the previous estimates at the same point, increasing the time 
steps does not effect (or improve) the final velocity estimates. Although it improves the 
estimate of the velocity field. 

Constraint Line Clustering methods do not use a previous velocity estimate at all. 
Consequently it is computed between two consecutive frames. Again the final velocity 
is estimated by averaging the velocity field in the region of motion. Figure 6.14 shows 
the velocity field computed by the Constraint Line Clustering algorithm. 

Both algorithms show good performance when the noise levels are low even if the 
image has a finite number of discontinuities (or when the image is not smooth over all). 
However, the performance decreases quickly with increasing noise levels. Therefore, the 
image frames are smoothed with a Gaussian lowpass filter implemented in the frequency 
domain to improve the velocity field estimates. The Relative Error vs. SNR for these 
algorithms are shown in Figures 6.15-6.16 for image frames with 2 different choices of 
backgrounds respectively. 
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Figure 6.11 The velocity field after 1, 4, 16, and 32 iterations. 
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Figure 6.12 The velocity field after 1, 4, 16, and 32 time steps. 
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Figure 6.13 The velocity field after 1, 4, 16, and 32 time steps (4 iterations per time 
step). 
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Figure 6.15 The Relative Error vs. SNR for Image Flow Constraint Eq. solutions. 
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Figure 6.16 The Relative Error vs. SNR for Im. Flow Const. Eq. with background. 
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C. THE SPATIOTEMPORAL FREQUENCY (3-D FFT) ALGORITHM 



This algorithm requires a large memory space since the previous image frames (or 
their 2-dimensional FFT’s) need to be saved in order to perform the FFT in the third 
dimension. Also, the definition and detection of the plane in the 3-dimensional 
spatiotemporal frequency space is another problem to be solved. In the simulations the 
planes are defined by their slopes in the vv„, and directions. These slopes are extracted 
by detecting the slope of the lines created by the intersection of the defined plane with 
the vertical planes normal to the w„ and vv,, axes. The slopes are determined as integer 
values which caused the final velocity estimates to be integer values as well. This is an 
acceptable situation since the actual velocity values are accepted as integer values too. 
The velocity estimates are affected by the number of frames available. Although it was 
not mentioned before, the effect of smearing (or leakage) is seen around the central 
plane because of missing image frames. Besides the expected nonzero central plane, the 
3-D FFT creates other planes which are parallel to this central plane. These planes are 
also supported by the noise terms. One test of velocity estimation with the 3-D FFT 
algorithm is shown in Figure 6. 17. Figures 6. 18-6. 19 show the Relative Error values vs. 
SNR and number of iterations for this algorithm. 

The 3-D FFT algorithm can be used for images with nonzero background as well. 
In this case, the algorithm creates a second nonzero plane with the slope of zero due to 
the constant background. The actual plane can be determined by deleting the plane 
associated with the constant background first. The smearing effect is also seen around 
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the plane associated with the constant background. This makes the detection of the 
desired plane harder and creates errors in the velocity estimates. The simulation results 
of the Spatiotemporal Frequency algorithm with image sequences including background 
are shown in Figures 6.20-6.21. 



3-D FFT (STF) ALGORITHM 
7 I 1 1 1 1 




ITERATIONS 



Figure 6.17 Velocity estimates with the Spatiotemporal Frequency algorithm. 



69 




Figure 6.18 The Relative Error vs. SNR for 3-D FFT algorithm. 
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Figure 6.19 The Relative Error vs. Iterations for 3-D FFT algorithm. 
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Figure 6.20 The Relative Error vs. SNR for 3-D FFT Algorithm (with background). 




Figure 6.21 The relative Error vs Iterations for 3-D FFT algorithm (with background). 
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D. THE 1-DIMENSIONAL EFT ALGORITHM 



This algorithm is the simplest, as mentioned before, but still gives fairly good 
velocity estimates. It takes much less computation time since it uses only one-dimensional 
Fourier transforms. Throughout the simulations, the estimates of this algorithm are also 
improved in two steps. First, the velocity components are found by averaging the results 
for 3 k, values (k. = 1,2,3). Second, the velocity components are computed by Spectral 
Estimation (MEM) methods as discussed in Chapter IV for k=l- The final velocity is 
estimated by comparing the MEM estimate with the average estimates. A sample test 
result is shown in Figure 6.22. For high SNR, even two frames are enough to yield a 
solution. But as the noise level increases more frames are needed. Overall this algorithm 
gives better results for the estimation of motion parameters than the other algorithms. 
The Relative Error vs. SNR and Iterations are plotted in Figures 6.23-6.26 for two 
choices of background. 
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Figure 6.22 Velocity estimates with 1-D FFT algorithm. 
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Figure 6.23 The Relative Error vs. SNR for 1-D FFT algorithm. 
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Figure 6.25 The Relative Error vs. SNR for 1-D FFT algorithm (with background). 
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Figure 6.26 The Relative Error vs. Iterations for 1-D FFT algorithm (with 
background). 
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E. RESTORATION AND IMAGE ENHANCEMENT 



The test image frames are restored by using the estimates of the EKF and by using 
the Velocity Tuned Filter (VTF) as outlined in Chapter V. The artificial images with 
decreasing SNR are used in the simulations. After each simulation, the SNR of the 
restored (or output) image frames are computed and compared with the SNR of the noisy 
(or input) image. The difference between these SNR values are plotted in Figures 6.30- 
6.31 to show the increase (or decrease) in the image quality. As seen from these plots 
the EKF does not improve the image quality for images with SNR’s higher than 10 dB. 
This occurs since some of the higher spatial frequencies are truncated in the EKF 
algorithm. Similiarly, the results show an increase in image quality for images with SNR 
terms lower than -20 dB. However when the restored images are checked, it is seen that 
the increase in SNR is not a real increase in image quality, instead, it is an increase 
related to lowpass filtering of the image caused by truncating frequencies. The same 
effect is seen for VTF after -25 dB SNR’s. Figures 6.26-6.29 show the original, noisy, 
and the restrored image frames of a sample run of these algorithms. 
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Figure 6.26 The original image frame without noise. 




Figure 6.27 The original image with additive noise (SNR = -15 dB). 



Figure 6.28 The restored image with EKF (SNR = -10 dB). 




Figure 6.29 The restored image vith VTF (SNR = -7.5 dB). 
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Figure 6.30 The increase in SNR with EKF. 
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Figure 6.31 The increase in SNR with VTF. 
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F. CONCLUSIONS 



As seen from the simulation results, the 1-D FFT algorithm gives the best result 
for the estimation of velocity components, although this algorithm does not preserve the 
original image. The EKF gives the second best result for the estimation of velocity 
components for low SNR images and also increases the image quality. The 3-D FFT 
(STF) algorithm does not perform as well as the others but promises an increase in the 
image quality with Velocity Tuned Filters (VTF). The algorithms that implement the 
Image Flow Constraint equation performed after smoothing. Additionally, these 
algorithms produce a velocity field besides only the estimation of a single 
two-dimensional velocity component. 



80 



Appendix A. Derivation of Image Flow Constraint Equation 
Let E(x,y,t) be the iiradiance at time t at the image point (x,y). Then if the image 
point is displaced a distance 5x in the x-direction and 6y in the y-direction in time 6t, we 
expect that the irradiance will be the same at time t+6t at the point (x+5x,y+5y). So, 

E(x,y,t) = E(x+bx,y+by,t+6t). (A.l) 

The right hand side of Equation A.l can be expanded about a point (x,y,t) using Taylor 
series and so obtain, 



E(x,y,t) + —bx + —by + —bt + e = E(x,y,0, (A.2) 

dx dy dt 

where e contains second and higher order terms in 5x,6y and 6t. After substracting 
E(x,y,t) from both sides and dividing through by 5t, the result is. 



dx bt dy bt 



+ M = cj(6r), 
dt 



(A.3) 



where co(6t) is a term of order 5t that includes second and higher variations of x and y. 
In the limit as 5t 0, the above equation becomes. 



Eu+Ev*E=0. (A.4) 

A ^ • 

This equation is called the image flow constraint equation as mentioned before. 
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Appendix B. Fourier Transform of a Linearly Translating image 
Let f(x,y,t) represent the time varying image function that contains a uniformly 
translating object as given by Equation 3.1, 



Aw) =Ax-v^yy-Vyt) 

=Ax,y) * Hx-Vj,y-Vyt) 
The 3-dimensional Fourier transform of f(x,y,t) is given by, 



(B.l) 



= j j f dx dy dt. (B-2) 



Using the representation given in Equation B.l gives. 



F(w^,Wy,w) J^x,y)*b(x-vJ,y-Vyt)]e'^^'^^^^’'*‘^'^dxdydt, (B-3) 



and 



F(w^,Wy,w)= Jr, 



F(w^,w^,w)= I 






dt. 



(B.5) 



Performing the integrals inside the brackets gives, 



F(w^,yVy,w^) = f { F(w^,wp } dt, (B-6) 
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and, 



=F(w^,Wy) J 



This yields the final result as, 

F(w^,Wy,w) =F(w^,w^ 



(B.7) 



(B.8) 
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